20 Gaussian Process
Recall the bivariate normal distribution:
$$\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}\sim\mathcal{N}\left(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix},\begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}\right).$$
We are curious about the conditional distribution of $Y_2$ given $Y_1 = y_1$:
$$Y_2\mid Y_1=y_1 \sim \mathcal{N}\left(\mu_2+\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1),\;(1-\rho^2)\sigma_2^2\right).$$
How can we visualize this? Plot the two indices on the horizontal axis and the sampled values $(y_1, y_2)$ on the vertical axis; each joint sample traces a two-point "function" of the index.
A Gaussian Process generalizes this concept to functions: the index set becomes continuous, and each sample is an entire function.
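As a sanity check, the conditional distribution of a bivariate normal can be verified by simulation. Below is a minimal sketch using NumPy; all parameter values are illustrative choices, not from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
mu1, mu2 = 1.0, 2.0
s1, s2, rho = 1.0, 0.5, 0.8

# Draw joint samples from the bivariate normal.
cov = np.array([[s1**2, rho * s1 * s2],
                [rho * s1 * s2, s2**2]])
Y = rng.multivariate_normal([mu1, mu2], cov, size=200_000)

# Condition on Y1 ~= y1 by keeping samples in a thin slice.
y1 = 1.5
mask = np.abs(Y[:, 0] - y1) < 0.02
emp_mean, emp_var = Y[mask, 1].mean(), Y[mask, 1].var()

# Closed-form conditional: N(mu2 + rho*(s2/s1)*(y1 - mu1), (1 - rho^2)*s2^2)
theo_mean = mu2 + rho * s2 / s1 * (y1 - mu1)
theo_var = (1 - rho**2) * s2**2
print(emp_mean, theo_mean)  # empirical and theoretical means agree closely
print(emp_var, theo_var)    # empirical and theoretical variances agree closely
```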
A Stochastic Process is a collection of random variables $\{Y(x) : x \in \mathcal{X}\}$ indexed by a set $\mathcal{X}$.
A Gaussian Process (GP) is a stochastic process s.t. any finite collection $(Y(x_1), \dots, Y(x_n))$ follows a multivariate normal distribution.
Moreover, define
- Mean function $m(x) = \mathbb{E}[Y(x)]$: can be anything. Popular choices: $m(x) = 0$ or a constant.
- Covariance function $k(x, x') = \operatorname{Cov}(Y(x), Y(x'))$:
  - Symmetric: $k(x, x') = k(x', x)$.
  - PSD (positive semi-definite): for any $x_1, \dots, x_n$, the Gram matrix $[k(x_i, x_j)]_{i,j}$ is positive semi-definite.
  - Stationary: $k(x, x')$ depends only on the difference $x - x'$.
  - Isotropic: $k(x, x')$ depends only on $\|x - x'\|$, which is a stronger condition. $\|\cdot\|$ is an arbitrary norm.
  - Positive definite: the Gram matrix at distinct points is strictly positive definite (stronger than PSD).
- If the kernel is infinitely differentiable (e.g. the RBF kernel below), the GP is infinitely differentiable in the mean-square sense.
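The symmetry and PSD properties above can be checked numerically on a Gram matrix. A sketch using an RBF kernel for scalar inputs; the grid and length-scale are arbitrary illustrative choices.

```python
import numpy as np

def rbf(x, xp, ell=1.0):
    """RBF kernel k(x, x') = exp(-(x - x')^2 / (2 ell^2)) for scalar inputs."""
    return np.exp(-(x - xp)**2 / (2 * ell**2))

x = np.linspace(0, 5, 30)
K = rbf(x[:, None], x[None, :])   # Gram matrix [k(x_i, x_j)]
eigvals = np.linalg.eigvalsh(K)   # all eigenvalues >= 0 up to round-off
print(eigvals.min())
```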
Sampling procedure:
- Discretize the domain as $x_1, \dots, x_n$.
- Sample $Y(x_1) \sim \mathcal{N}(m(x_1), k(x_1, x_1))$.
- For $i = 2, 3, \dots$, sample $Y(x_i) \mid Y(x_1), \dots, Y(x_{i-1})$, where the conditional distribution is again normal.

Denote $K_n = [k(x_i, x_j)]_{i,j=1}^{n}$ and $\vec{C}_{n+1} = (k(x_1, x_{n+1}), \dots, k(x_n, x_{n+1}))^{\mathrm{T}}$.
Sample $Y(x_{n+1})$ from $\mathcal{N}(\mu_{n+1}, \sigma_{n+1}^2)$. So
$$\begin{align*}
\mu_{n+1}&=m(x_{n+1})+\vec{C}_{n+1}^{\mathrm{T}}K_{n}^{-1}\begin{bmatrix}
Y(x_{1})-m(x_{1}) \\ \vdots \\ Y(x_{n})-m(x_{n})
\end{bmatrix},\\
\sigma_{n+1}^{2}&= k(x_{n+1},x_{n+1})-\vec{C}_{n+1}^{\mathrm{T}}K_{n}^{-1}\vec{C}_{n+1}.
\end{align*}$$
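The sequential sampling procedure can be sketched as follows, assuming $m(x) = 0$ and an RBF kernel; `sample_path` and all parameters are illustrative names, and a tiny jitter is added to keep the solve stable.

```python
import numpy as np

def k(a, b, ell=0.5):
    """RBF kernel for scalar inputs (illustrative choice)."""
    return np.exp(-(a - b)**2 / (2 * ell**2))

def sample_path(xs, rng):
    """Sample a GP path at points xs by sequential conditioning."""
    ys = []
    for i, xnew in enumerate(xs):
        if i == 0:
            mu, var = 0.0, k(xnew, xnew)
        else:
            xo, yo = np.asarray(xs[:i]), np.array(ys)
            Kn = k(xo[:, None], xo[None, :]) + 1e-9 * np.eye(i)  # jitter
            C = k(xo, xnew)                                      # C_{n+1}
            mu = C @ np.linalg.solve(Kn, yo)                     # mu_{n+1}
            var = k(xnew, xnew) - C @ np.linalg.solve(Kn, C)     # sigma_{n+1}^2
        ys.append(mu + np.sqrt(max(var, 0.0)) * rng.standard_normal())
    return np.array(ys)

rng = np.random.default_rng(1)
path = sample_path(np.linspace(0, 2, 50), rng)
```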
- Warning: $K_n$ can be ill-conditioned, leading to numerical issues.
A solution: replace $K_n$ with $K_n + \sigma^2 I$, where $\sigma^2 I$ is a constant diagonal matrix. This is similar to the regularization in Ridge regression.
Interpretation: noisy observation. Each $Y(x_i)$ is observed with some additive noise independent of the GP: $\tilde{Y}(x_i) = Y(x_i) + \varepsilon_i$, $\varepsilon_i \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2)$. The $\varepsilon_i$ are called the nugget, which are independent of the GP.
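A quick numerical illustration of the ill-conditioning and the nugget fix; the kernel, grid, and nugget size below are arbitrary illustrative choices.

```python
import numpy as np

# RBF Gram matrix on a dense grid: nearby columns are nearly identical,
# so the matrix is numerically close to singular.
x = np.linspace(0, 1, 40)
K = np.exp(-(x[:, None] - x[None, :])**2 / (2 * 0.5**2))

cond_raw = np.linalg.cond(K)
cond_reg = np.linalg.cond(K + 1e-6 * np.eye(40))  # add a nugget
print(cond_raw, cond_reg)  # conditioning improves dramatically
```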
Common kernels:
- RBF Kernel:
  $$k(x, x') = \sigma^2 \exp\left(-\frac{\|x - x'\|^2}{2\ell^2}\right).$$
- Matern Kernel:
  $$k(x, x') = \sigma^2 \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{\sqrt{2\nu}\,r}{\ell}\right)^{\nu} K_{\nu}\!\left(\frac{\sqrt{2\nu}\,r}{\ell}\right), \qquad r = \|x - x'\|,$$
  where $K_\nu$ is the modified Bessel function of the second kind. For half-integer $\nu = p + \frac{1}{2}$,
  $$k(x, x') = \sigma^2 \exp\left(-\frac{\sqrt{2\nu}\,r}{\ell}\right) \frac{p!}{(2p)!} \sum_{i=0}^{p} \frac{(p+i)!}{i!\,(p-i)!} \left(\frac{2\sqrt{2\nu}\,r}{\ell}\right)^{p-i},$$
  where $p \in \{0, 1, 2, \dots\}$. The summation is a polynomial in $r$ of degree $p$. So a GP with $\nu = p + \frac{1}{2}$ is $p$ times differentiable in the "mean square" sense.
- Brownian Motion (Non-stationary): $k(s, t) = \min(s, t)$.
- Brownian Bridge (Non-stationary): $k(s, t) = \min(s, t) - st$ for $s, t \in [0, 1]$. This corresponds to a stochastic process conditioned on $Y(1) = 0$.
- Ornstein-Uhlenbeck: $k(s, t) = \exp\left(-\frac{|s - t|}{\ell}\right)$. This model has applications in physics, biology, finance, etc.
- Linear (Non-stationary): Bayesian Linear Regression. Suppose $Y(x) = \beta^{\mathrm{T}} x$, where $\beta \sim \mathcal{N}(0, \Sigma)$ and $x \in \mathbb{R}^d$. Then for any $x, x'$,
  $$\operatorname{Cov}(Y(x), Y(x')) = x^{\mathrm{T}} \Sigma x'.$$
  So the corresponding kernel is $k(x, x') = x^{\mathrm{T}} \Sigma x'$.
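Scalar-input versions of these kernels can be sketched as follows; all function names are illustrative, and the Matern is shown only for $\nu = 3/2$ (i.e. $p = 1$), where the polynomial factor is degree 1.

```python
import numpy as np

def rbf(s, t, ell=1.0, sigma2=1.0):
    return sigma2 * np.exp(-(s - t)**2 / (2 * ell**2))

def matern32(s, t, ell=1.0, sigma2=1.0):
    # Matern with nu = 3/2 (p = 1): exponential times a degree-1 polynomial.
    a = np.sqrt(3) * np.abs(s - t) / ell
    return sigma2 * (1 + a) * np.exp(-a)

def brownian(s, t):
    return np.minimum(s, t)

def brownian_bridge(s, t):
    return np.minimum(s, t) - s * t   # vanishes when s = 1 or t = 1

def ou(s, t, ell=1.0):
    return np.exp(-np.abs(s - t) / ell)

def linear(x, xp, Sigma):
    return x @ Sigma @ xp
```

Note that `brownian_bridge(1.0, t)` is identically zero, matching the interpretation of a process conditioned on $Y(1) = 0$.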
Prediction:
Training data $(x_1, y_1), \dots, (x_n, y_n)$. How can we predict $Y(x^*)$ for a new sample point $x^*$? We should use the sampling procedure: set $x_{n+1} = x^*$ and condition on the observed values, so that $Y(x^*) \mid Y(x_1) = y_1, \dots, Y(x_n) = y_n \sim \mathcal{N}(\mu_{n+1}, \sigma_{n+1}^2)$, with $\mu_{n+1}$ and $\sigma_{n+1}^2$ given above.
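A sketch of the prediction step under zero-mean RBF assumptions; `gp_predict`, the training data, and the nugget size are all illustrative choices.

```python
import numpy as np

def rbf(a, b, ell=0.5):
    return np.exp(-(a - b)**2 / (2 * ell**2))

def gp_predict(x_train, y_train, x_star, noise=1e-4):
    """Posterior mean/variance at x_star via the conditional formulas."""
    n = len(x_train)
    Kn = rbf(x_train[:, None], x_train[None, :]) + noise * np.eye(n)
    C = rbf(x_train, x_star)                    # vector C_{n+1}
    mu = C @ np.linalg.solve(Kn, y_train)       # posterior mean
    var = rbf(x_star, x_star) - C @ np.linalg.solve(Kn, C)
    return mu, var

x_train = np.array([0.0, 0.5, 1.0])
y_train = np.sin(3 * x_train)
mu, var = gp_predict(x_train, y_train, 0.5)
print(mu, var)  # mean near y_train[1], small variance at an observed point
```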